Abstract


Many  current  algorithms  for  nonlinear  constrained  optimization 
problems  determine  a  direction  by  solving  a  quadratic  programming 
subproblem.  The  global  convergence  properties  are  addressed  by  using  a  line 
search  technique  and  a  merit  function  to  modify  the  length  of  the  step 
obtained  from  the  quadratic  program. 

In unconstrained optimization, trust region strategies have been very
successful. In this paper we present a new approach for equality constrained
optimization problems based on a trust region strategy. The direction
selected is not necessarily the solution of the standard quadratic
programming subproblem.
1. Introduction. Consider the equality constrained optimization
problem:

\[
\text{(NLE)} \qquad
\begin{array}{ll}
\text{minimize} & f(x) \\
\text{subject to} & g(x) = 0,
\end{array}
\]

where $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}^m$ ($m \le n$). It is assumed that the
problem functions are at least twice continuously differentiable, that a
solution exists, and that $\nabla g(x)$ has full rank.

Several  authors  including  Fletcher  [2],  Gay  [3],  and  Sorensen  [12], 
have  considered  a  trust  region  approach  for  optimization  problems  with 
linear  constraints.  From  a  theoretical  point  of  view,  the  extension  from 
unconstrained  optimization  to  linearly  constrained  optimization  is 
somewhat straightforward; one merely focuses attention on the subspace
of  interest.  For  nonlinear  constraints  the  extension  is  not  at  all  clear. 
The  main  attempt  in  this  area  has  been  Vardi  [14].  While  this  work 
contains  some  interesting  results,  it  leaves  several  important  questions 
unanswered.  Our  objective  is  to  develop  an  effective  trust  region 
algorithm  for  problem  NLE. 


2.  Motivation  for  Our  Approach.  One  of  the  more  successful  methods 
for  solving  problem  NLE  is  the  successive  quadratic  programming  (SQP) 
approach  where,  at  each  iteration,  the  step  is  calculated  as  the  solution 
of  the  quadratic  programming  problem  : 

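\[
\text{(QP)} \qquad
\begin{array}{ll}
\text{minimize} & q_{QP}(s) = \nabla_x L(x,\lambda)^T s + \tfrac{1}{2} s^T B s \\
\text{subject to} & g(x) + \nabla g(x)^T s = 0,
\end{array}
\]

where $\nabla_x L(x,\lambda)$ is the gradient of the Lagrangian function

\[ L(x,\lambda) = f(x) + \lambda^T g(x), \]

$\lambda \in \mathbb{R}^m$, and $B$ is an approximation to $\nabla_x^2 L(x,\lambda)$. The step for the
multiplier $\lambda$ is obtained as the multiplier associated with the solution of
problem QP.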
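As an illustration of the SQP step just described, the following is a minimal sketch (not the implementation used in this paper) that solves problem QP through its first-order optimality (KKT) system with NumPy. The function name `sqp_step`, the matrix name `A` for $\nabla g(x)$ (stored as an n-by-m array), and the numerical data are assumptions made only for this example.

```python
import numpy as np

def sqp_step(grad_L, B, A, g):
    """Solve: minimize grad_L^T s + 0.5 s^T B s  subject to  g + A^T s = 0.

    A is the n-by-m matrix of constraint gradients (hypothetical name for grad g(x)).
    Returns the step s and the multiplier of the linearized constraints.
    """
    n, m = A.shape
    # KKT system: [B  A; A^T  0] [s; lam] = [-grad_L; -g]
    K = np.block([[B, A], [A.T, np.zeros((m, m))]])
    rhs = -np.concatenate([grad_L, g])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n:]

# Small made-up data, chosen only so that the KKT matrix is nonsingular.
B = np.diag([2.0, 3.0, 4.0])
A = np.array([[1.0], [1.0], [1.0]])
g = np.array([0.5])
grad_L = np.array([1.0, -1.0, 0.5])

s, lam = sqp_step(grad_L, B, A, g)
```

The returned `s` satisfies the linearized constraints exactly (up to rounding), and `lam` plays the role of the multiplier step mentioned above.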

The  most  natural  way  to  introduce  the  trust  region  idea  is  to  add  a 
constraint  which  restricts  the  size  of  the  step  in  problem  QP,  see  Vardi 
[14].  However,  this  approach  may  lead  to  inconsistent  constraints,  and 
it  is  not  clear  how  to  overcome  this  problem.  Instead  of  adding  the  trust 
region  constraint  to  the  standard  QP  problem,  we  consider  adding  it  to  a 
somewhat different problem.

Suppose we want to solve $g(x) = 0$ using a standard trust region
method. We have a current point $x_c$ and a bound $\Delta_c$ on the length of the
step we are willing to take from $x_c$. At each iteration the step is
calculated by solving:

\[
\begin{array}{ll}
\text{minimize} & \tfrac{1}{2} \| g(x_c) + \nabla g(x_c)^T s \|_2^2 \\
\text{subject to} & \| s \|_2 \le \Delta_c,
\end{array}
\]

where $\| \cdot \|_2$ denotes the 2-norm.

If the algorithm simply took the "best steepest descent step", i.e.,
the Cauchy step $s_{CP}$, then under reasonable assumptions global
convergence can be demonstrated; see Powell [10], Moré and
Sorensen [7], and Schultz, Schnabel and Byrd [11]. That is, as long as
the step $s$ satisfies

\[ \| g(x_c) + \nabla g(x_c)^T s \|_2^2 \le \| g(x_c) + \nabla g(x_c)^T s_{CP} \|_2^2, \]

convergence to a solution of $g(x) = 0$ is obtained. This fact is the basis
for our approach.
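To make the Cauchy step concrete, here is a small sketch, under the assumption that the model being reduced is $\tfrac{1}{2}\| g(x_c) + \nabla g(x_c)^T s \|_2^2$ and that the Cauchy step is its minimizer along the steepest descent direction, restricted to the trust region. The names `cauchy_step` and `A` (for $\nabla g(x_c)$, stored n-by-m) and the data are hypothetical.

```python
import numpy as np

def cauchy_step(A, g, delta):
    """Minimize 0.5*||g + A^T s||^2 along -A g, subject to ||s|| <= delta."""
    d = A @ g                          # gradient of the model at s = 0
    norm_d = np.linalg.norm(d)
    if norm_d == 0.0:                  # already stationary for the model
        return np.zeros(A.shape[0])
    curv = A.T @ d                     # change of the linearized residual along d
    t_free = norm_d**2 / (curv @ curv) # unconstrained minimizer along -d
    t_max = delta / norm_d             # step length that reaches the boundary
    return -min(t_free, t_max) * d

# Made-up example with n = 3 variables and m = 2 constraints.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
g = np.array([1.0, -0.5])
delta = 0.1

s_cp = cauchy_step(A, g, delta)
res_before = np.linalg.norm(g)
res_after = np.linalg.norm(g + A.T @ s_cp)
```

In this example the trust region bound is active, and the linearized residual still decreases, which is the property the convergence argument relies on.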

Define the set $Y$ as

\[
Y = \left\{ s : \| s \|_2 \le \Delta_c \ \text{and} \
\| g(x_c) + \nabla g(x_c)^T s \|_2^2 \le \| g(x_c) + \nabla g(x_c)^T s_{CP} \|_2^2 \right\}.
\]

That is, $Y$ is the set of steps from $x_c$ that are inside the trust region and
give at least as much descent on the 2-norm of the residuals of the
linearized constraints as the Cauchy step (see Figure 1). By choosing
any point in $Y$ we will generate a sequence which is guaranteed to
converge to a feasible point. We take advantage of this freedom by
choosing an $s$ which minimizes a quadratic model, $q_c(s)$, of the objective
function $f$ over $Y$. The step is calculated by solving the problem:

\[
\begin{array}{ll}
\text{minimize} & q_c(s) \\
\text{subject to} & \| s \|_2 \le \Delta_c, \\
 & \| g(x_c) + \nabla g(x_c)^T s \|_2^2 \le \theta_c,
\end{array}
\]

where $q_c(s)$ is a quadratic approximation to the function $f$ and

\[ \theta_c = \| g(x_c) + \nabla g(x_c)^T s_{CP} \|_2^2. \]
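The definition of the set $Y$ and of the bound $\theta_c$ can be sketched in a few lines. This is only an illustration under the assumption that the Cauchy step is the steepest descent minimizer of the linearized residual inside the trust region; the helper names `lin_residual_sq` and `in_Y`, the matrix name `A` for $\nabla g(x_c)$, and the data are hypothetical.

```python
import numpy as np

def lin_residual_sq(A, g, s):
    """Squared 2-norm of the linearized constraint residual g + A^T s."""
    r = g + A.T @ s
    return float(r @ r)

def in_Y(s, A, g, delta, theta, tol=1e-12):
    """True if s lies in the trust region and reduces the residual to at most theta."""
    inside = np.linalg.norm(s) <= delta + tol
    enough_descent = lin_residual_sq(A, g, s) <= theta + tol
    return bool(inside and enough_descent)

# Made-up data: n = 3 variables, m = 2 constraints.
A = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
g = np.array([1.0, -0.5])
delta = 0.5

# Cauchy step: steepest descent on 0.5*||g + A^T s||^2, truncated at the boundary.
d = A @ g
t = min((d @ d) / np.linalg.norm(A.T @ d) ** 2, delta / np.linalg.norm(d))
s_cp = -t * d

theta_c = lin_residual_sq(A, g, s_cp)   # bound used in the second constraint
```

By construction the Cauchy step itself belongs to $Y$, so the set is never empty, while the zero step does not belong to it unless the Cauchy step gives no decrease.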
Figure 1


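3. Theory. We consider the problem:

\[
\text{(QPQ)} \qquad
\begin{array}{ll}
\text{minimize} & q(s) = a^T s + \tfrac{1}{2} s^T B s \\
\text{subject to} & \| s \|_2 \le \Delta, \\
 & \| g(x) + \nabla g(x)^T s \|_2^2 \le \theta,
\end{array}
\]

where $a \in \mathbb{R}^n$ and $B \in \mathbb{R}^{n \times n}$ is symmetric and nonsingular. Problem QPQ is
the basis of our trust region approach to equality constrained
minimization. Its solution is given by the following lemma.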
Lemma 3.1 Problem QPQ is solved by:

\[
s(\mu,\eta) = -\left( B + \mu I + \eta \nabla g(x) \nabla g(x)^T \right)^{-1} \left( a + \eta \nabla g(x) g(x) \right)
\]

for $\mu, \eta \ge 0$ such that $\| s(\mu,\eta) \|_2 = \Delta$ and $\| g(x) + \nabla g(x)^T s(\mu,\eta) \|_2^2 = \theta$, unless:

$\| g(x) + \nabla g(x)^T s(0,0) \|_2^2 < \theta$ and $\| s(0,0) \|_2 < \Delta$, in which case
$s(0,0)$ is the solution,
or

$\| g(x) + \nabla g(x)^T s(\mu,0) \|_2^2 < \theta$ and $\| s(\mu,0) \|_2 = \Delta$, in which case
$s(\mu,0)$ is the solution,
or

$\| g(x) + \nabla g(x)^T s(0,\eta) \|_2^2 = \theta$ and $\| s(0,\eta) \|_2 < \Delta$, in which case
$s(0,\eta)$ is the solution.

Proof. The proof is a straightforward application of the necessary
conditions of constrained optimization. □
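The formula of Lemma 3.1 can be evaluated directly for given multipliers. The sketch below is only an illustration: it computes $s(\mu,\eta)$ by one linear solve and does not search for the multipliers that make the constraints active. The function name `qpq_step`, the matrix name `A` for $\nabla g(x)$, and the data are assumptions.

```python
import numpy as np

def qpq_step(a, B, A, g, mu, eta):
    """Evaluate s(mu, eta) = -(B + mu*I + eta*A A^T)^{-1} (a + eta*A g)."""
    n = B.shape[0]
    H = B + mu * np.eye(n) + eta * (A @ A.T)
    return -np.linalg.solve(H, a + eta * (A @ g))

# Made-up data with B symmetric positive definite.
B = np.diag([2.0, 3.0, 4.0])
A = np.array([[1.0], [1.0], [1.0]])
g = np.array([0.5])
a = np.array([1.0, -1.0, 0.5])

s00 = qpq_step(a, B, A, g, 0.0, 0.0)    # unconstrained minimizer of q
s_mu = qpq_step(a, B, A, g, 5.0, 0.0)   # larger mu: shorter step in this example
s_eta = qpq_step(a, B, A, g, 0.0, 10.0) # larger eta: smaller linearized residual here

res00 = abs((g + A.T @ s00)[0])
res_eta = abs((g + A.T @ s_eta)[0])
```

In this example, raising $\mu$ shortens the step and raising $\eta$ reduces the linearized constraint residual, which matches the roles the two multipliers play in problem QPQ.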

By defining $a$ and $B$ in various ways we can show how the solution to
problem QPQ is related to existing theory. The following theorem shows
that if the quadratic model $q(s)$ is the Taylor expansion of $f$, and the
trust region constraint is not binding, then our step is the Newton step
on the standard penalty function with penalty constant $\eta$. It is
important to note that $\eta$ is not a free parameter, but is determined by
the solution to problem QPQ.


THEOREM 3.1 Let $a = \nabla f(x)$ and $B = \nabla^2 f(x)$. If $\nabla^2 f(x)$ is nonsingular
and $\Delta$ is such that the constraint $\| s \| \le \Delta$ is not binding, then
$s(0,\eta)$ is the Newton step for the standard penalty function

\[ P(x) = f(x) + \tfrac{1}{2} \eta \, g(x)^T g(x). \]

Moreover, if $\nabla^2 f(x)$ is positive definite, then for any $\mu \ge 0$, $s(\mu,\eta)$ is a
descent direction for $P(x)$.

Proof. The proof of the first part is straightforward from the
definition of the Newton step for minimizing a function. Details can be
found in section 5.5 of Dennis and Schnabel [1]. In this case

\[
s(0,\eta) = -\left( \nabla^2 f(x) + \eta \nabla g(x) \nabla g(x)^T \right)^{-1} \left( \nabla f(x) + \eta \nabla g(x) g(x) \right).
\]

To prove that $s(\mu,\eta)$ is a descent direction for $P$, it is sufficient to show
that $\nabla P(x)^T s(\mu,\eta) < 0$. Noting that $\mu$ and $\eta$ are nonnegative, $\nabla^2 f(x)$ is
positive definite, and $\nabla P(x) = \nabla f(x) + \eta \nabla g(x) g(x)$, we have

\[
\nabla P(x)^T s(\mu,\eta) = -\nabla P(x)^T \left( \nabla^2 f(x) + \mu I + \eta \nabla g(x) \nabla g(x)^T \right)^{-1} \nabla P(x) < 0. \quad \Box
\]
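The descent property in Theorem 3.1 is easy to check numerically. The following sketch uses randomly generated stand-ins for $\nabla f(x)$, $\nabla^2 f(x)$ (made positive definite by construction), $\nabla g(x)$ and $g(x)$; all names and data are hypothetical and serve only to illustrate the inequality $\nabla P(x)^T s(\mu,\eta) < 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 2

M = rng.standard_normal((n, n))
hess_f = M @ M.T + n * np.eye(n)   # symmetric positive definite stand-in for the Hessian of f
grad_f = rng.standard_normal(n)    # stand-in for the gradient of f
A = rng.standard_normal((n, m))    # stand-in for grad g(x), stored n-by-m
g = rng.standard_normal(m)         # stand-in for g(x)

def slope(mu, eta):
    """Directional derivative of P along s(mu, eta), i.e. grad P^T s(mu, eta)."""
    grad_P = grad_f + eta * (A @ g)
    H = hess_f + mu * np.eye(n) + eta * (A @ A.T)
    s = -np.linalg.solve(H, grad_P)
    return float(grad_P @ s)

# Evaluate the slope on a small grid of nonnegative multipliers.
slopes = [slope(mu, eta) for mu in (0.0, 1.0, 10.0) for eta in (0.0, 1.0, 100.0)]
```

Every entry of `slopes` is negative, because the matrix being inverted is positive definite for all nonnegative $\mu$ and $\eta$.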

Now we show that if $q(s)$ is the Taylor expansion of the Lagrangian
function, then the step that solves problem QPQ is the Newton step on
the augmented Lagrangian function

\[ AL(x,\lambda) = f(x) + \lambda^T g(x) + \tfrac{1}{2} \eta \, g(x)^T g(x). \]

Again, it is important to note that the penalty constant is determined by
the solution to problem QPQ.

THEOREM 3.2 Let $a = \nabla_x L(x,\lambda)$ and $B = \nabla_x^2 L(x,\lambda)$. If $\nabla_x^2 L(x,\lambda)$ is
nonsingular and $\Delta$ is such that the constraint $\| s \| \le \Delta$ is not binding,
then $s(0,\eta)$ is the Newton step for the augmented Lagrangian. Moreover,
if $\nabla_x^2 L(x,\lambda)$ is positive definite, then for any $\mu \ge 0$, $s(\mu,\eta)$ is a
descent direction for $AL(x,\lambda)$.


Proof. The proof is analogous to the proof of the previous
theorem. □

We have shown how our approach relates to the standard penalty
function and the augmented Lagrangian. It is also possible to relate the
solution to problem QPQ to $s_{QP}$, the solution of problem QP. We know

\[ s_{QP} = -B^{-1} \left( \nabla f(x) + \nabla g(x) \lambda \right), \]

where

\[ \lambda = \left( \nabla g(x)^T B^{-1} \nabla g(x) \right)^{-1} \left( g(x) - \nabla g(x)^T B^{-1} \nabla f(x) \right). \]

See Tapia [13] for details and background material. The following
theorem shows that one should not expect the solutions of problems
QPQ and QP to be the same. It is reasonable to compare solutions of the
two problems only in the case that the trust region constraint in problem
QPQ is not binding.
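The closed-form expressions for $s_{QP}$ and $\lambda$ can be checked with a short computation. This is a sketch under the assumption that $B$ is symmetric positive definite and $\nabla g(x)$ has full column rank; the function name `sqp_step_closed_form`, the matrix name `A`, and the random data are hypothetical.

```python
import numpy as np

def sqp_step_closed_form(grad_f, B, A, g):
    """Evaluate lambda and s_QP from the closed-form expressions.

    A stands for grad g(x), stored as an n-by-m array.
    """
    Binv_A = np.linalg.solve(B, A)            # B^{-1} grad g
    Binv_gf = np.linalg.solve(B, grad_f)      # B^{-1} grad f
    lam = np.linalg.solve(A.T @ Binv_A, g - A.T @ Binv_gf)
    s = -np.linalg.solve(B, grad_f + A @ lam)
    return s, lam

rng = np.random.default_rng(1)
n, m = 4, 2
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)        # symmetric positive definite by construction
A = rng.standard_normal((n, m))
g = rng.standard_normal(m)
grad_f = rng.standard_normal(n)

s_qp, lam = sqp_step_closed_form(grad_f, B, A, g)
lin_res = g + A.T @ s_qp           # residual of the linearized constraints
```

The residual `lin_res` vanishes up to rounding, confirming that the multiplier formula is exactly the one that makes the step satisfy the linearized constraints.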
THEOREM 3.3 Let $a = \nabla_x L(x,\lambda)$, and $\Delta$ be such that the constraint
$\| s \| \le \Delta$ is not binding. Then the solution of problem QPQ is the solution
to problem QP if and only if the unconstrained minimizer of the
quadratic, $q(s)$, satisfies the linearized constraints.


Proof. First we will assume that $s_{QP} = s(\mu,\eta)$ and show that the
unconstrained minimizer of $q(s)$ satisfies the linearized constraints.

Since $s_{QP}$ solves problem QP, it satisfies the linearized constraints, i.e.,

\[ g(x) + \nabla g(x)^T s_{QP} = 0. \]

Given that $s(\mu,\eta) = s_{QP}$, we have

\[ g(x) + \nabla g(x)^T s(\mu,\eta) = 0. \]

If $\theta = 0$, we observe that problems QP and QPQ are equivalent;
therefore, we consider $\theta > 0$. Since $\theta > 0$ and the residual at $s(\mu,\eta)$ is
zero, the constraint

\[ \| g(x) + \nabla g(x)^T s \|_2^2 \le \theta \]

is not binding and the multiplier $\eta$ associated with this constraint is 0.
Since $\| s \| \le \Delta$ is assumed not to be a binding constraint, the solution
to problems QP and QPQ is

\[ s(0,0) = -B^{-1} \nabla_x L(x,\lambda), \]

which is the unconstrained minimizer of $q(s)$.

Next we must show that if the unconstrained minimizer of $q(s)$
satisfies the linearized constraints, then $s_{QP} = s(\mu,\eta)$. The result follows from
the fact that in this case both problems become the same unconstrained
minimization problem. □

As we progress through the iterations of our algorithm we should
expect to have $\Delta$ large and $\theta \to 0$. Clearly for $\Delta$ sufficiently large we will
have $\mu = 0$. Also from Theorem 3.3 we are led to conjecture that $\eta \to \infty$ as
$\theta \to 0$. Hence, we are interested in the behavior of the solution of problem
QPQ as $\eta \to \infty$ and $\mu \to 0$. The following theorem gives us this behavior,
which can be viewed as a form of consistency. Namely, while the solutions
of problem QP and problem QPQ are in general never the same, as $\eta \to \infty$
and $\mu \to 0$ the solution of problem QPQ approaches the solution of
problem QP. Thus we should expect our algorithm to eventually generate
steps which are arbitrarily close to the SQP step. In practice we have
found this to be the case. These comments are the subject of the
following theorem.


THEOREM 3.4 Let $a = \nabla_x L(x,\lambda)$, and $B$ be positive definite. Then

\[ \lim_{(\mu,\eta) \to (0,\infty)} s(\mu,\eta) = s_{QP}. \]


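Proof. To prove this theorem we need to obtain
$(B + \mu I + \eta \nabla g(x) \nabla g(x)^T)^{-1}$. By the Sherman-Morrison-Woodbury formula,
see page 50 of Ortega and Rheinboldt [8], we have

\[
(B + \mu I + \eta \nabla g(x) \nabla g(x)^T)^{-1} = (B + \mu I)^{-1} - \eta (B + \mu I)^{-1} \nabla g(x)
\left[ I + \eta \nabla g(x)^T (B + \mu I)^{-1} \nabla g(x) \right]^{-1} \nabla g(x)^T (B + \mu I)^{-1}.
\]

Therefore,

\[
\begin{aligned}
s(\mu,\eta) &= -\left( (B + \mu I) + \eta \nabla g(x) \nabla g(x)^T \right)^{-1} \left( \nabla_x L(x,\lambda) + \eta \nabla g(x) g(x) \right) \\
&= -(B + \mu I)^{-1} \left( I - \nabla g(x) \left[ \tfrac{1}{\eta} I + \nabla g(x)^T (B + \mu I)^{-1} \nabla g(x) \right]^{-1} \nabla g(x)^T (B + \mu I)^{-1} \right)
\left( \nabla_x L(x,\lambda) + \eta \nabla g(x) g(x) \right) \\
&= -(B + \mu I)^{-1} \left( \nabla_x L(x,\lambda) + \nabla g(x) \left[ \tfrac{1}{\eta} I + \nabla g(x)^T (B + \mu I)^{-1} \nabla g(x) \right]^{-1}
\left( g(x) - \nabla g(x)^T (B + \mu I)^{-1} \nabla_x L(x,\lambda) \right) \right).
\end{aligned}
\]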
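The last equality in this chain compresses a short computation, which may be worth spelling out. Using the shorthand (introduced here only for this remark) $M = (B + \mu I)^{-1}$, $A = \nabla g(x)$ and $W = [\tfrac{1}{\eta} I + A^T M A]^{-1}$, the terms involving $g(x)$ combine as

\[
\eta A g(x) - \eta A W A^T M A \, g(x) = \eta A W \left( W^{-1} - A^T M A \right) g(x) = \eta A W \left( \tfrac{1}{\eta} I \right) g(x) = A W g(x),
\]

so that

\[
\left( I - A W A^T M \right) \left( \nabla_x L(x,\lambda) + \eta A g(x) \right) = \nabla_x L(x,\lambda) + A W \left( g(x) - A^T M \nabla_x L(x,\lambda) \right).
\]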

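Taking limits as $\eta \to \infty$ and $\mu \to 0$ we have

\[
\lim_{(\mu,\eta) \to (0,\infty)} s(\mu,\eta) = -B^{-1} \left( \nabla_x L(x,\lambda) + \nabla g(x) \left[ \nabla g(x)^T B^{-1} \nabla g(x) \right]^{-1}
\left( g(x) - \nabla g(x)^T B^{-1} \nabla_x L(x,\lambda) \right) \right).
\]

It is straightforward to see that by substituting $\nabla f(x) + \nabla g(x) \lambda$ for $\nabla_x L(x,\lambda)$
we obtain $s_{QP}$. □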
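The limit in Theorem 3.4 can also be observed numerically. The sketch below compares $s(\mu,\eta)$, for increasing $\eta$ and decreasing $\mu$, with the step obtained by solving the KKT system of the equality constrained quadratic program. The data and the names `qpq_step`, `A` and `s_qp` are assumptions chosen for illustration.

```python
import numpy as np

# Made-up data: B symmetric positive definite, A = grad g(x) with full column rank.
B = np.array([[2.0, 0.5, 0.0], [0.5, 3.0, 0.0], [0.0, 0.0, 4.0]])
A = np.array([[1.0], [1.0], [1.0]])
g = np.array([0.5])
a = np.array([1.0, -1.0, 0.5])     # plays the role of the gradient of the Lagrangian
n, m = A.shape

def qpq_step(mu, eta):
    """s(mu, eta) = -(B + mu*I + eta*A A^T)^{-1} (a + eta*A g)."""
    H = B + mu * np.eye(n) + eta * (A @ A.T)
    return -np.linalg.solve(H, a + eta * (A @ g))

# Reference step: minimize a^T s + 0.5 s^T B s subject to g + A^T s = 0.
K = np.block([[B, A], [A.T, np.zeros((m, m))]])
s_qp = np.linalg.solve(K, -np.concatenate([a, g]))[:n]

# Distance to the QP step as eta grows and mu shrinks together.
errors = [np.linalg.norm(qpq_step(10.0 ** -k, 10.0 ** k) - s_qp) for k in (1, 3, 5)]
```

The entries of `errors` decrease as $\eta$ grows and $\mu$ shrinks, in line with the statement of the theorem.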


4. Numerical Results. In order to study the effectiveness of our
approach from arbitrary starting points, we produced a preliminary
implementation. Problem QPQ was solved by a modification of the
iterative process that was first suggested for nonlinear least squares by
Hebden [4] and Moré [6]. For our quadratic objective function we choose
$q(s) = \nabla f(x)^T s + \tfrac{1}{2} s^T \nabla^2 f(x) s$ with no multiplier approximations. Although
the algorithm is not completely defined, we wanted to obtain some feel
for the robustness of the approach. For this we compared our method,
SQPQC, with an SQP approach, VF02AD by Powell [9], which is available
in the Harwell Subroutine Library.

We now list a subset of our test problems. These problems are
referenced and can be found in Hock and Schittkowski [5]. The number
in parentheses denotes the number given to this problem in [5], $n$ is the
number of variables in the problem, and $m$ is the number of equality
constraints.

Problem 1 (60) n = 3, m = 1
$f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^4$
$g_1(x) = x_1 (1 + x_2^2) + x_3^4 - 4 - 3(2)^{1/2}$
$x^* \approx (1.1048, 1.1966, 1.5352)$


Problem 2 (77) n = 5, m = 2
$f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_4 - 1)^4 + (x_5 - 1)^6$
$g_1(x) = x_1^2 x_4 + \sin(x_4 - x_5) - 2(2)^{1/2}$
$g_2(x) = x_2 + x_3^4 x_4^2 - 8 - (2)^{1/2}$
$x^* \approx (1.1661, 1.1821, 1.3802, 1.5060, 0.6109)$


Problem 3 (79) n = 5, m = 3
$f(x) = (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_4 - 1)^4 + (x_5 - 1)^4$
$g_1(x) = x_1 + x_2^2 + x_3^3 - 2 - 3(2)^{1/2}$
$g_2(x) = x_2 - x_3^2 + x_4 + 2 - 2(2)^{1/2}$
$g_3(x) = x_1 x_5 - 2$
$x^* \approx (1.1911, 1.3626, 1.4728, 1.6350, 1.6790)$
Problem 4 (78) n = 5, m = 3
$f(x) = x_1 x_2 x_3 x_4 x_5$
$g_1(x) = x_1^2 + x_2^2 + x_3^2 + x_4^2 + x_5^2 - 10$
$g_2(x) = x_2 x_3 - 5 x_4 x_5$
$g_3(x) = x_1^3 + x_2^3 + 1$
$x^* \approx (-1.7171, 1.5957, 1.8272, -0.7636, -0.7636)$
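As a quick consistency check on the listing, the constraint functions can be evaluated at the reported solutions. The sketch below does this for Problems 1 and 2; the exponents are taken as they appear in the Hock and Schittkowski collection, which is an assumption on our reading of the listing, and the function names are hypothetical.

```python
import numpy as np

def g_problem1(x):
    """Constraint of Problem 1 (number 60 in the collection), as read from the listing."""
    x1, x2, x3 = x
    return x1 * (1.0 + x2**2) + x3**4 - 4.0 - 3.0 * np.sqrt(2.0)

def g_problem2(x):
    """Constraints of Problem 2 (number 77 in the collection), as read from the listing."""
    x1, x2, x3, x4, x5 = x
    return np.array([
        x1**2 * x4 + np.sin(x4 - x5) - 2.0 * np.sqrt(2.0),
        x2 + x3**4 * x4**2 - 8.0 - np.sqrt(2.0),
    ])

# Reported approximate solutions, rounded to four decimals in the listing.
x1_star = np.array([1.1048, 1.1966, 1.5352])
x2_star = np.array([1.1661, 1.1821, 1.3802, 1.5060, 0.6109])

r1 = abs(g_problem1(x1_star))
r2 = float(np.max(np.abs(g_problem2(x2_star))))
```

Both residuals are small, at the level expected from solutions rounded to four decimal places, so the reported points are feasible to that accuracy.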


The  results  from  this  subset  of  test  problems  are  reported  in 
Table  1.  The  column  labeled  Convergence  indicates  whether  or  not 
convergence  was  obtained,  and  the  number  in  parentheses  indicates  the 
number  of  iterations  the  algorithm  took  to  converge.  This  number  does 
not  give  meaningful  comparisons  for  many  reasons,  including  the  fact 
that  the  algorithm  is  only  in  a  preliminary  stage.  We  have,  however, 
included  it  for  completeness. 

Although the number of problems is small, it can be seen that
SQPQC converges for all the problems on which VF02AD converges. We have
found several problems where the line search routine in VF02AD fails, and
thus halts, but our trust region routine is successful. One example is
problem 2 with starting point (10, 10, 10, 10, 10): at the first iteration
in VF02AD, the line search routine fails to locate a better point,
whereas our trust region routine succeeds in finding a next iterate and
proceeds to find the solution.


5. Concluding Remarks. We have presented a framework for a trust
region approach for solving equality constrained optimization problems.
At each iteration the subproblem we solve is not in general the successive
quadratic programming (SQP) subproblem. We have motivated the
conjecture that asymptotically our step is the same as the step produced
by solving the SQP subproblem.

The  theoretical  results  presented  in  this  paper,  although 
preliminary,  have  established  important  links  between  the  step  selection 
process  and  several  widely  used  merit  functions.  We  have  shown  that  the 
step  we  obtain  is  a  descent  direction  on  either  the  standard  penalty 
function  or  the  augmented  Lagrangian  function,  where  each  penalty 
constant  is  provided  by  the  solution  to  the  associated  subproblem. 

A  preliminary  implementation  of  our  approach  has  produced  good 
numerical results. These numerical results, and the preliminary theory,
lead us to believe that our approach is worthy of continued research.